When learning a language other than one's mother tongue, a learner may study pronunciation by listening to audio. If text corresponding to the audio is displayed at the same time, the user can more easily grasp the content of the audio. For example, Patent Literature 1 discloses a playback device that can search for a playback position in a video based on subtitles added to the video. This playback device can also repeat playback based on the subtitles. Sections that are difficult to hear can therefore be played back repeatedly, which enhances the learning effect.